在这项工作中,我们为UNET体系结构引入了一个受生物学启发的远程跳过连接,该连接依赖于混合图像的感知幻觉,是同时编码两个图像的图像。早期编码器特征与更深的解码器的融合允许UNET模型产生更细粒度的密集预测。尽管在细分任务中经过证明,但由于这些远程跳过连接还会导致纹理转移伪像,因此网络的好处对于密集的回归任务进行了下降加权。特别是为了深度估计,这损害了光滑度,并引入了假正边,这是由于深度地图的平滑性质而对任务有害的。拟议的Hybridskip连接显示在平衡边缘保存之间的权衡方面的性能得到了改善,以及损害光滑度的纹理转移伪像的最小化。这是通过分别在高频和低频,编码器和解码器特征之间提供的信息的适当和平衡的信息来实现的。
translated by 谷歌翻译
球形摄像机以整体方式捕获场景,并已用于房间布局估计。最近,随着适当数据集的可用性,从单个全向图像中的深度估计也取得了进展。尽管这两个任务是互补的,但很少有作品能够并行探索它们以提高室内几何感知,而那些这样做的人则依靠合成数据或使用过的小型数据集,因为很少有选项可供选择,包括两个布局。在真实场景中的注释和密集的深度图。这部分是由于需要对房间布局进行手动注释。在这项工作中,我们超越了此限制,并生成360几何视觉(360V)数据集,该数据集包括多种模式,多视图立体声数据并自动生成弱布局提示。我们还探索了两个任务之间的明确耦合,以将它们集成到经过单打的训练模型中。我们依靠基于深度的布局重建和基于布局的深度注意,这表明了两项任务的性能提高。通过使用单个360摄像机扫描房间,出现了便利和快速建筑规模3D扫描的机会。
translated by 谷歌翻译
将带家具的房间图像转换为背景的任务 - 仅是非常具有挑战性,因为它需要在仍然保持整体布局和风格的同时进行大量变化。为了获得照片 - 现实和结构一致的背景,现有的深度学习方法使用图像修复方法或将场景布局的学习作为个人任务,以后在不完全可分辨率的语义区域自适应归一代化模块中利用它。为了解决这些缺点,我们将场景布局生成视为特征线性变换问题,并提出了一个简单但有效的调整后的完全可分辨率的软语义区域 - 自适应归一化模块(SoftSean)块。我们展示了现实和深度估计任务的缩短和深度估计任务中的适用性,在那里我们的方法除了减轻培训复杂性和不可差异性问题的优点,超越了定量和定性的比较方法。我们的SoftSean块可用作现有辨别和生成模型的液位模块。在vcl3d.github.io/panodr/上提供实现。
translated by 谷歌翻译
在这项工作中,我们为计算机愿景任务提供了分发换档基准;单眼深度估计。我们的差异化是对野外数据的不受控制测试的更广泛分配转变的分解,以三个不同的分布换档。具体地,我们通过合成生成数据,并分析它们以产生协变量(颜色输入),之前(深度输出)和概念(其关系)分布换档。我们还综合组合并展示了每个对地址的确实是一个不同的挑战,因为堆叠它们产生增加的性能下降,并且不能使用标准方法水平寻址。
translated by 谷歌翻译
Merging satellite products and ground-based measurements is often required for obtaining precipitation datasets that simultaneously cover large regions with high density and are more accurate than pure satellite precipitation products. Machine and statistical learning regression algorithms are regularly utilized in this endeavour. At the same time, tree-based ensemble algorithms for regression are adopted in various fields for solving algorithmic problems with high accuracy and low computational cost. The latter can constitute a crucial factor for selecting algorithms for satellite precipitation product correction at the daily and finer time scales, where the size of the datasets is particularly large. Still, information on which tree-based ensemble algorithm to select in such a case for the contiguous United States (US) is missing from the literature. In this work, we conduct an extensive comparison between three tree-based ensemble algorithms, specifically random forests, gradient boosting machines (gbm) and extreme gradient boosting (XGBoost), in the context of interest. We use daily data from the PERSIANN (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks) and the IMERG (Integrated Multi-satellitE Retrievals for GPM) gridded datasets. We also use earth-observed precipitation data from the Global Historical Climatology Network daily (GHCNd) database. The experiments refer to the entire contiguous US and additionally include the application of the linear regression algorithm for benchmarking purposes. The results suggest that XGBoost is the best-performing tree-based ensemble algorithm among those compared. They also suggest that IMERG is more useful than PERSIANN in the context investigated.
translated by 谷歌翻译
Networks have become indispensable and ubiquitous structures in many fields to model the interactions among different entities, such as friendship in social networks or protein interactions in biological graphs. A major challenge is to understand the structure and dynamics of these systems. Although networks evolve through time, most existing graph representation learning methods target only static networks. Whereas approaches have been developed for the modeling of dynamic networks, there is a lack of efficient continuous time dynamic graph representation learning methods that can provide accurate network characterization and visualization in low dimensions while explicitly accounting for prominent network characteristics such as homophily and transitivity. In this paper, we propose the Piecewise-Velocity Model (PiVeM) for the representation of continuous-time dynamic networks. It learns dynamic embeddings in which the temporal evolution of nodes is approximated by piecewise linear interpolations based on a latent distance model with piecewise constant node-specific velocities. The model allows for analytically tractable expressions of the associated Poisson process likelihood with scalable inference invariant to the number of events. We further impose a scalable Kronecker structured Gaussian Process prior to the dynamics accounting for community structure, temporal smoothness, and disentangled (uncorrelated) latent embedding dimensions optimally learned to characterize the network dynamics. We show that PiVeM can successfully represent network structure and dynamics in ultra-low two-dimensional spaces. It outperforms relevant state-of-art methods in downstream tasks such as link prediction. In summary, PiVeM enables easily interpretable dynamic network visualizations and characterizations that can further improve our understanding of the intrinsic dynamics of time-evolving networks.
translated by 谷歌翻译
Functionality and dialogue experience are two important factors of task-oriented dialogue systems. Conventional approaches with closed schema (e.g., conversational semantic parsing) often fail as both the functionality and dialogue experience are strongly constrained by the underlying schema. We introduce a new paradigm for task-oriented dialogue - Dialog2API - to greatly expand the functionality and provide seamless dialogue experience. The conversational model interacts with the environment by generating and executing programs triggering a set of pre-defined APIs. The model also manages the dialogue policy and interact with the user through generating appropriate natural language responses. By allowing generating free-form programs, Dialog2API supports composite goals by combining different APIs, whereas unrestricted program revision provides natural and robust dialogue experience. To facilitate Dialog2API, the core model is provided with API documents, an execution environment and optionally some example dialogues annotated with programs. We propose an approach tailored for the Dialog2API, where the dialogue states are represented by a stack of programs, with most recently mentioned program on the top of the stack. Dialog2API can work with many application scenarios such as software automation and customer service. In this paper, we construct a dataset for AWS S3 APIs and present evaluation results of in-context learning baselines.
translated by 谷歌翻译
Designing powerful adversarial attacks is of paramount importance for the evaluation of $\ell_p$-bounded adversarial defenses. Projected Gradient Descent (PGD) is one of the most effective and conceptually simple algorithms to generate such adversaries. The search space of PGD is dictated by the steepest ascent directions of an objective. Despite the plethora of objective function choices, there is no universally superior option and robustness overestimation may arise from ill-suited objective selection. Driven by this observation, we postulate that the combination of different objectives through a simple loss alternating scheme renders PGD more robust towards design choices. We experimentally verify this assertion on a synthetic-data example and by evaluating our proposed method across 25 different $\ell_{\infty}$-robust models and 3 datasets. The performance improvement is consistent, when compared to the single loss counterparts. In the CIFAR-10 dataset, our strongest adversarial attack outperforms all of the white-box components of AutoAttack (AA) ensemble, as well as the most powerful attacks existing on the literature, achieving state-of-the-art results in the computational budget of our study ($T=100$, no restarts).
translated by 谷歌翻译
In this work, we study the numerical solution of inverse eigenvalue problems from a machine learning perspective. Two different problems are considered: the inverse Strum-Liouville eigenvalue problem for symmetric potentials and the inverse transmission eigenvalue problem for spherically symmetric refractive indices. Firstly, we solve the corresponding direct problems to produce the required eigenvalues datasets in order to train the machine learning algorithms. Next, we consider several examples of inverse problems and compare the performance of each model to predict the unknown potentials and refractive indices respectively, from a given small set of the lowest eigenvalues. The supervised regression models we use are k-Nearest Neighbours, Random Forests and Multi-Layer Perceptron. Our experiments show that these machine learning methods, under appropriate tuning on their parameters, can numerically solve the examined inverse eigenvalue problems.
translated by 谷歌翻译
In contrast to the rapid digitalization of several industries, agriculture suffers from low adoption of smart farming tools. While AI-driven digital agriculture tools can offer high-performing predictive functionalities, they lack tangible quantitative evidence on their benefits to the farmers. Field experiments can derive such evidence, but are often costly, time consuming and hence limited in scope and scale of application. To this end, we propose an observational causal inference framework for the empirical evaluation of the impact of digital tools on target farm performance indicators (e.g., yield in this case). This way, we can increase farmers' trust via enhancing the transparency of the digital agriculture market and accelerate the adoption of technologies that aim to secure farmer income resilience and global agricultural sustainability. As a case study, we designed and implemented a recommendation system for the optimal sowing time of cotton based on numerical weather predictions, which was used by a farmers' cooperative during the growing season of 2021. We then leverage agricultural knowledge, collected yield data, and environmental information to develop a causal graph of the farm system. Using the back-door criterion, we identify the impact of sowing recommendations on the yield and subsequently estimate it using linear regression, matching, inverse propensity score weighting and meta-learners. The results reveal that a field sown according to our recommendations exhibited a statistically significant yield increase that ranged from 12% to 17%, depending on the method. The effect estimates were robust, as indicated by the agreement among the estimation methods and four successful refutation tests. We argue that this approach can be implemented for decision support systems of other fields, extending their evaluation beyond a performance assessment of internal functionalities.
translated by 谷歌翻译